Faster Sublinear Algorithms using Conditional Sampling
Abstract
A conditional sampling oracle for a probability distribution D returns samples from the conditional distribution of D restricted to a specified subset of the domain. A recent line of work [CFGM13, CRS14] has shown that, with access to such an oracle, only a polylogarithmic or even constant number of samples is needed to solve distribution testing problems such as identity and uniformity testing. This significantly improves over the standard sampling model, where polynomially many samples are necessary. Inspired by these results, we introduce a computational model based on conditional sampling to develop sublinear algorithms with exponentially faster runtimes than standard sublinear algorithms. We focus on geometric optimization problems over points in high-dimensional Euclidean space. Access to these points is provided via a conditional sampling oracle that takes as input a succinct representation of a subset of the domain and outputs a uniformly random point in that subset. We study two well-studied problems: k-means clustering and estimating the weight of the minimum spanning tree. In contrast to prior algorithms for the classic model, our algorithms have time, space, and sample complexity that is polynomial in the dimension and polylogarithmic in the number of points. Finally, we comment on the applicability of the model and compare it with existing ones such as the streaming, parallel, and distributed computational models.
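To make the oracle interface concrete, here is a minimal Python sketch of a conditional sampling oracle over a finite point set. It is an illustration under stated assumptions, not the paper's construction: the class name `ConditionalSampler`, the choice of an axis-aligned box as the "succinct representation" of a subset, and the query counter are all hypothetical. The simulation scans every point, so it only mimics the oracle's input/output behavior; in the model itself, each query is treated as a single unit-cost operation.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]
Box = Sequence[Tuple[float, float]]  # assumed subset encoding: one (low, high) per coordinate


class ConditionalSampler:
    """Toy oracle: returns a uniformly random stored point that lies inside a query box."""

    def __init__(self, points: List[Point], seed: Optional[int] = None) -> None:
        self.points = points
        self.rng = random.Random(seed)
        self.queries = 0  # number of oracle calls made so far

    def sample(self, box: Box) -> Optional[Point]:
        self.queries += 1
        # Naive simulation: filter all points by the box, then pick one uniformly.
        inside = [
            p for p in self.points
            if all(lo <= x <= hi for x, (lo, hi) in zip(p, box))
        ]
        return self.rng.choice(inside) if inside else None


points = [(0.1, 0.2), (0.4, 0.9), (0.8, 0.3), (0.9, 0.95)]
oracle = ConditionalSampler(points, seed=0)
# Condition on the left half of the unit square: only the first two points qualify.
print(oracle.sample([(0.0, 0.5), (0.0, 1.0)]))
```

An algorithm in this model would only interact with `sample`, so its sample complexity corresponds to the final value of `queries`.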
Similar resources
Faster Algorithms for Testing under Conditional Sampling
There has been considerable recent interest in distribution-tests whose run-time and sample requirements are sublinear in the domain-size k. We study two of the most important tests under the conditional-sampling model where each query specifies a subset S of the domain, and the response is a sample drawn from S according to the underlying distribution. For identity testing, which asks whether ...
Sublinear Time Orthogonal Tensor Decomposition
A recent work (Wang et al., NIPS 2015) gives the fastest known algorithms for orthogonal tensor decomposition with provable guarantees. Their algorithm is based on computing sketches of the input tensor, which requires reading the entire input. We show that in a number of cases one can achieve the same theoretical guarantees in sublinear time, i.e., even without reading most of the input tensor. In...
Constructing Sublinear Expectations on Path Space
We provide a general construction of time-consistent sublinear expectations on the space of continuous paths. It yields the existence of the conditional G-expectation of a Borel-measurable (rather than quasi-continuous) random variable, a generalization of the random G-expectation, and an optional sampling theorem that holds without exceptional set. Our results also shed light on the inherent li...
The Sparse Fourier Transform: Theory & Practice
The Fourier transform is one of the most fundamental tools for computing the frequency representation of signals. It plays a central role in signal processing, communications, audio and video compression, medical imaging, genomics, astronomy, as well as many other areas. Because of its widespread use, fast algorithms for computing the Fourier transform can benefit a large number of applications...
Sublinear Algorithms in the External Memory Model
We initiate the study of sublinear-time algorithms in the external memory model [Vit01]. In this model, the data is stored in blocks of a certain size B, and the algorithm is charged a unit cost for each block access. This model is well-studied, since it reflects the computational issues occurring when the (massive) input is stored on a disk. Since each block access operates on B data elements ...